ABSTRACT

This paper introduces a new class of feasible direction methods for
solving differentiable convex programs with nonlinear convex constraints.
Unlike many presently used methods, the ones introduced here are not based
on the Fritz John or the Kuhn-Tucker theory but rather on two recent
characterizations of optimality without a constraint qualification. The new
methods are capable of generating feasible directions of descent along the
boundary of the feasible set and they consistently give directions of
steeper descent than many popular methods. This is achieved by
solving only one linear program at each iteration. The new methods are
particularly useful in solving large sparse convex programs; some of the
programs tested had 100 variables and 50 nonlinear constraints. Moreover,
the new methods are applicable whether or not Slater's condition or any
other constraint qualification is satisfied.

1. INTRODUCTION


Many numerical methods for solving differentiable convex programs with
nonlinear constraints are based on the necessary conditions for optimality
such as the Fritz John condition (e.g. [14]). In the presence of Slater's
constraint qualification the Fritz John condition is equivalent to the
Kuhn-Tucker condition (e.g. [3], [14]) and it characterizes the optimality
of a feasible point. It is well known that many popular feasible direction
methods (e.g. [13], [23], [24], [25]) produce rather slow convergence when
applied to convex programs with sparse nonlinear constraints. The main
reason is that such methods generate at each iteration a direction of
decrease always pointing into the interior of the feasible set and
therefore a movement along the boundary of the feasible set is excluded.
(See Example 2 below.)


This paper introduces a new class of feasible direction methods. These
methods are based on two different complete characterizations of optimality
for convex programs stated in [1] and [3]. Some of the properties of the
new methods are that they are capable of generating feasible directions of
decrease along the boundary of the feasible region and that they
consistently give a feasible direction of steeper descent than many
Zoutendijk-type methods. The latter is achieved by solving only one linear
program at each iteration. The fact that the new methods allow movement
along the boundary of the feasible region is particularly useful in the
case of programs with sparse nonlinear constraints because the feasible
sets of such programs normally have "flat" parts. A movement along the flat
parts can significantly speed up the convergence. Another important
property of the new methods is that they can solve large sparse programs.
Several of the programs tested had 100 variables and 50 nonlinear
constraints. Some of the methods introduced here are also applicable to
convex programs whether or not Slater's condition or any other constraint
qualification is satisfied. This proves useful in the numerical treatment
of programs arising in multicriteria decision making processes because, as
is well known (e.g. [2], [17]), such programs generally lack Slater's
condition.

Section 2 provides a motivation for the paper. In particular it is
shown that the feasible direction methods based on the necessary conditions
of optimality generally terminate at a nonoptimal point if Slater's
condition does not hold (see Example 1), and also that they point towards
the interior of the feasible set if Slater's condition does hold (see
Example 2). Theoretical background for the "Method of Elimination of Linear
Programs" is given in Section 3. At every iteration the method generates
several directions of decrease and then chooses among them one which is
locally of the steepest descent. This method, studied in Section 4, is
particularly useful for convex programs with sparse constraints which are
strictly convex in their "actual variables". It can solve problems
regardless of a constraint qualification assumption. Section 5 introduces a
fundamentally different type of numerical method based on a parametric
characterization of optimality given in [1]. The method is formulated for a
rather large class of convex programs, namely those having "faithfully
convex" constraints and satisfying Slater's condition. An overall
computational experience and two solved test programs are given in
Section 6.


2. MOTIVATION


Let $f^0, f^1, \dots, f^p$ be functions: $R^n \to R$. Consider the
mathematical programming problem

$$
(P)\qquad \text{Min } f^0(x) \quad \text{s.t.} \quad f^k(x) \le 0,\quad k \in P \triangleq \{1,2,\dots,p\}.
$$

It is assumed that an optimal solution $x^*$ exists and that the functions
$\{f^k : k \in \{0\} \cup P\}$ are convex and differentiable. Associated
with problem (P) is the feasible set

$$F = \{x \in R^n : f^k(x) \le 0,\ k \in P\}.$$

For $x \in F$ we denote by $P(x)$ the set of active constraints, i.e.

$$P(x) = \{k \in P : f^k(x) = 0\}.$$
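The definition of the active set can be sketched in a few lines. This is only an illustration: the function name `active_set`, the tolerance `tol`, and the two sample constraints are assumptions made here, not part of the paper.

```python
def active_set(x, constraints, tol=1e-9):
    """Return the 1-based indices k with f^k(x) = 0 (within tol).

    `constraints` is a list of callables f^1, ..., f^p; the point x must
    satisfy f^k(x) <= 0 for every k, otherwise a ValueError is raised.
    """
    values = [f(x) for f in constraints]
    if any(v > tol for v in values):
        raise ValueError("x is not feasible")
    return [k for k, v in enumerate(values, start=1) if abs(v) <= tol]


# Hypothetical constraints on R^2: the unit disc and the half-plane x_1 >= 0.
constraints = [
    lambda x: x[0] ** 2 + x[1] ** 2 - 1.0,
    lambda x: -x[0],
]

boundary_corner = active_set((0.0, 1.0), constraints)   # both constraints bind
interior_point = active_set((0.5, 0.0), constraints)    # nothing binds
```

In floating point the equality $f^k(x) = 0$ has to be replaced by a tolerance test, which is why `tol` appears; the choice of its value is problem dependent.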

In order to calculate an optimal solution $x^*$ of (P) one can apply
a feasible direction method (e.g. [13], [22], [23], [24], [25]). First
we recall that a vector $d \in R^n$ is a feasible direction at $x \in F$ if
there exists $\bar\alpha > 0$ such that $x + \alpha d \in F$ for all
$0 \le \alpha \le \bar\alpha$.

A feasible direction method iterates as follows: From a point
$x^\ell \in F$, use a mapping $T$ to determine a feasible direction
$d^\ell$. Then apply a mapping $M$ to minimize the objective function $f^0$
on a segment of the ray emanating from $x^\ell$ in the direction $d^\ell$.
Thus a new feasible point

$$x^{\ell+1} = MTx^\ell = M(x^\ell + \alpha d^\ell) = x^\ell + \alpha_\ell d^\ell$$

is obtained, where the step-size $\alpha_\ell$ is a solution of the
(one-dimensional) problem

$$(S,\ell)\qquad \text{Min } \{f^0(x^\ell + \alpha d^\ell) : \alpha \ge 0,\ x^\ell + \alpha d^\ell \in F\}.$$

The direction $d^\ell$ is typically chosen to be a direction of descent,
i.e. $d^\ell$ satisfies (for a nonoptimal $x^\ell$)

$$(d^\ell)^t \nabla f^0(x^\ell) < 0.$$
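The step-size problem $(S,\ell)$ is a one-dimensional convex minimization over the feasible part of a ray. The following is a minimal sketch of one way to approximate it, not the procedure used in the paper: bisection locates the largest feasible step (the feasible steps form an interval because $F$ is convex), and a golden-section search then minimizes $f^0$ on that interval. The names `max_feasible_step`, `step_size` and the cap `alpha_hi` on the search interval are assumptions introduced here.

```python
import math


def max_feasible_step(x, d, constraints, alpha_hi=1.0, tol=1e-10):
    """Largest alpha in [0, alpha_hi] with x + alpha*d feasible (bisection)."""
    def feasible(a):
        p = [xi + a * di for xi, di in zip(x, d)]
        return all(f(p) <= 1e-12 for f in constraints)

    if feasible(alpha_hi):
        return alpha_hi
    lo, hi = 0.0, alpha_hi          # x itself (alpha = 0) is assumed feasible
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo


def step_size(f0, x, d, constraints, alpha_hi=1.0, tol=1e-8):
    """Golden-section search for min f0(x + alpha*d) over feasible alpha."""
    a, b = 0.0, max_feasible_step(x, d, constraints, alpha_hi)
    phi = lambda t: f0([xi + t * di for xi, di in zip(x, d)])
    r = (math.sqrt(5.0) - 1.0) / 2.0
    c, e = b - r * (b - a), a + r * (b - a)
    while b - a > tol:
        if phi(c) < phi(e):         # minimizer lies in [a, e]
            b, e = e, c
            c = b - r * (b - a)
        else:                       # minimizer lies in [c, b]
            a, c = c, e
            e = a + r * (b - a)
    return 0.5 * (a + b)


# Hypothetical data: f0 = (x1 - 2)^2 + x2^2, constraint x1 <= 1.
f0 = lambda x: (x[0] - 2.0) ** 2 + x[1] ** 2
alpha = step_size(f0, (0.0, 0.0), (1.0, 0.0), [lambda p: p[0] - 1.0], alpha_hi=5.0)
```

In this example the unconstrained minimizer along the ray lies at $\alpha = 2$, but feasibility stops the step at the boundary $\alpha = 1$.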

In Zoutendijk's classical method a feasible direction of descent at
$x \in F$ is found by solving the linear program

$$
(Z)\qquad
\begin{aligned}
&\text{Max } \lambda\\
&\text{s.t.}\quad d^t \nabla f^k(x) + \lambda \le 0,\quad k \in \{0\} \cup P(x),\\
&\phantom{\text{s.t.}\quad} |d_i| \le 1,\quad i = 1,\dots,n.
\end{aligned}
$$

If an optimal solution of (Z) is $(d^z, \lambda^z)$ and $\lambda^z > 0$,
then $d^z$ is a feasible direction of descent.

There are several variants of (Z) (see e.g. [23, p. 68] and [13, p. 246]).
We will call (Z) and its variants the Zoutendijk method (abbreviated:
Z-method). The convergence properties and problems of "jamming"
(or "zig-zagging") of the Z-method have been widely discussed in the
literature (e.g. [13], [22], [23], [24], [25]). Many currently used methods
for solving convex programs (e.g. [13], [16], [19], [21], [25]) are based
on the classical Fritz John or Kuhn-Tucker theories (e.g. [11], [14]).
Whenever these theories fail to characterize optimal solutions of (P), such
methods generally fail to produce an optimal solution or behave
pathologically. Let us first discuss briefly the situation when the
Kuhn-Tucker theory fails; the case when it does not fail will be treated
later. We will discuss in this paper only the Zoutendijk method. However,
the basic points of the discussion apply also to the above mentioned
methods.


(i) We pose the following question: Does the Z-method generally
solve problem (P), with nonlinear constraints, in the absence of Slater's
condition:

$$\exists\, \hat x \ \text{such that}\ f^k(\hat x) < 0,\quad k \in P. \tag{1}$$

The answer to this question is negative. If (1) is not satisfied, then
at a nonoptimal feasible point $x^\ell$, (Z) may give $\lambda^z = 0$ and
the method terminates at the nonoptimal $x^\ell$. This is illustrated by
the following example.

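Example 1. Consider the problem

$$
\begin{aligned}
\text{Min } f^0(x) &= x_1 + x_2 + x_3\\
\text{s.t.}\quad f^1(x) &= x_1^2 + x_2^2 - 2 \le 0\\
f^2(x) &= (x_1-2)^2 + (x_2-2)^2 - 2 \le 0\\
f^3(x) &= x_3^2 - 2x_3 \le 0\\
x &= (x_1, x_2, x_3)^t.
\end{aligned}
$$

The feasible set is

$$F = \{(1, 1, x_3)^t : 0 \le x_3 \le 2\}$$

and the unique optimal solution is $x^* = (1,1,0)^t$.

Let the initial approximation be $x^0 = (1,1,1)^t$. Problem (Z) is
here

$$
\begin{aligned}
&\text{Max } \lambda\\
&\text{s.t.}\quad d_1 + d_2 + d_3 + \lambda \le 0\\
&\phantom{\text{s.t.}\quad} 2d_1 + 2d_2 + \lambda \le 0\\
&\phantom{\text{s.t.}\quad} -2d_1 - 2d_2 + \lambda \le 0\\
&\phantom{\text{s.t.}\quad} |d_i| \le 1,\quad i = 1,2,3.
\end{aligned}
$$

The optimal value is $\lambda^z = 0$ and an optimal solution, produced by
the simplex method for linear programming, is $d^z = 0$, $\lambda^z = 0$.
Thus the next approximation is

$$x^1 = x^0 + \alpha d^z = x^0$$

and the algorithm terminates at $x^0$, a nonoptimal point.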
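The zero optimal value in Example 1 can be checked without an LP solver. For a fixed $d$, the largest $\lambda$ allowed by the constraints $d^t g + \lambda \le 0$ is $\min_g(-d^t g)$; adding the second and third constraints of (Z) gives $2\lambda \le 0$, so no $d$ can do better than $\lambda = 0$. The sketch below evaluates this on a coarse grid. The helper name `best_lambda`, the grid, and the gradient values (computed here from the constraints as read above) are assumptions of this illustration.

```python
from itertools import product


def best_lambda(d, gradients):
    """For fixed d, the largest lambda with d.g + lambda <= 0 for every g."""
    return min(-sum(di * gi for di, gi in zip(d, g)) for g in gradients)


# Gradients at x0 = (1, 1, 1): objective, f^1 and f^2 (f^3 is inactive there).
grads = [(1.0, 1.0, 1.0), (2.0, 2.0, 0.0), (-2.0, -2.0, 0.0)]

# Sample the box |d_i| <= 1 on a grid; the gradients of f^1 and f^2 are
# opposite, so one of the two constraints always forces lambda <= 0.
grid = [i / 4 for i in range(-4, 5)]
best_on_grid = max(best_lambda(d, grads) for d in product(grid, repeat=3))

# The direction d = (0, 0, -1) also attains lambda = 0 in (Z).
lam_down = best_lambda((0.0, 0.0, -1.0), grads)
```

The grid search is only a sanity check of the algebraic argument, not a substitute for solving (Z).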

Let us note that among infinitely many optimal solutions of problem
(Z) in this example there is also an optimal solution with $d_3 < 0$, which
generates $d^z$ pointing in the right direction. However, this solution
is not generally produced by the simplex method, so the Z-method generally
fails.

In spite of the popular belief, the situations in which the Kuhn-Tucker
theory does not characterize optimality are numerous and they appear
naturally. The areas of convex programming in which Slater's condition is
never satisfied and the optimal solutions are not generally the Kuhn-Tucker
points include mathematical programming formulations of Pareto optimization
(e.g. [2], [4]), lexicographic multicriteria decision making problems (e.g.
[12]), Chebyshev solutions of inconsistent programs (e.g. [17]), programs
with an ordered set of constraints (e.g. [7]) and theory of games with
information exchange (e.g. [7]). Numerical methods presented in Section 4
can solve (at least in principle) such programs together with those for
which a constraint qualification is satisfied.

(ii) Let us now consider the problems for which Slater's condition
is satisfied. The optimal solutions of problem (P) are now the Kuhn-Tucker
points and they are obtainable by the Z-method. However, the Z-method
produces at each step only one feasible direction of decrease, always
pointing to the interior of the feasible set $F$. This may result in poor
convergence, especially in the case of sparse nonlinear constraints of
problem (P). One such situation will now be demonstrated.

Example  2.  Consider  the  program 

Min  f®(x)  - Xj  + X£ 

s.  t. 

,1,  . 2 2 2 

f (x)  * -x1  - J x2  + 5 x2  s 0 

f 2 (x)  - X2  - 1 SO 

f3(x)  - (x2  - l)2  - 1 SO. 


Definition 1. Let $f^k : R^n \to R$. Let $[k] \subset \{1,\dots,n\}$ denote
the index set of the variables $\{x_j\}$ on which $f^k$ actually depends:

$$[k] = \{j : \text{there exist } x_i,\ i \ne j, \text{ such that the function } x_j \mapsto f^k(x_1,\dots,x_n) \text{ is not a constant}\}.$$

For any $x \in R^n$ the subvector $x_{[k]}$ is obtained by deleting the
components $\{x_j : j \notin [k]\}$. The restriction $f^{[k]}$ of $f^k$ is
the function $f^{[k]} : R^{\mathrm{card}[k]} \to R$ obtained by restricting
$f^k$ to $x_{[k]}$.

Example 3. Consider the constraint functions of the problem from
Example 1,

$$
\begin{aligned}
f^1(x) &= x_1^2 + x_2^2 - 2\\
f^2(x) &= (x_1-2)^2 + (x_2-2)^2 - 2\\
f^3(x) &= x_3^2 - 2x_3\\
x &= (x_1, x_2, x_3)^t.
\end{aligned}
$$

All these functions are considered, in the problem, as functions defined
on $R^3$. However, $f^1$ and $f^2$ actually depend only on $x_1$ and $x_2$
(i.e. $f^1$ and $f^2$ are constant with respect to $x_3$), while $f^3$
actually depends only on $x_3$ (i.e. $f^3$ is constant with respect to
$x_1$ and $x_2$). Therefore

$$[1] = [2] = \{1,2\} \quad \text{and} \quad [3] = \{3\}.$$
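The index sets $[k]$ can be estimated numerically by perturbing one coordinate at a time. The sketch below is a heuristic under stated assumptions: the function name `actual_variables`, the sample points and the perturbation `step` are choices made here, and the three functions are the constraints of Example 1 as read from the damaged source. Sampling can miss a dependence, but it cannot report a variable on which the function does not depend.

```python
def actual_variables(f, n, samples, step=0.5):
    """Heuristic estimate of [k]: 1-based indices j such that changing x_j
    changes the value of f at one of the sample points."""
    support = set()
    for x in samples:
        for j in range(n):
            y = list(x)
            y[j] += step
            if f(y) != f(x):
                support.add(j + 1)
    return support


def subvector(x, support):
    """The subvector x_[k]: keep only the components listed in `support`."""
    return tuple(x[j - 1] for j in sorted(support))


f1 = lambda x: x[0] ** 2 + x[1] ** 2 - 2
f2 = lambda x: (x[0] - 2) ** 2 + (x[1] - 2) ** 2 - 2
f3 = lambda x: x[2] ** 2 - 2 * x[2]

samples = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
supports = [actual_variables(f, 3, samples) for f in (f1, f2, f3)]
```

When the functions are available in symbolic form, reading the index sets off the formulas is of course more reliable than sampling.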


Furthermore, for $k = 1,2,3$, the subvector of $x$ consisting only of the
actual variables of $f^k$ is denoted by $x_{[k]}$. Here

$$x_{[1]} = x_{[2]} = (x_1, x_2)^t \quad \text{and} \quad x_{[3]} = (x_3).$$

The restriction $f^{[k]}$ of $f^k$, $k = 1,2,3$, is the function $f^k$
considered only as the function acting on the subspace determined by the
actual variables.

In this example all three functions $f^k$, $k = 1,2,3$, are convex, but
none is strictly convex. However, all three functions considered as
functions of their actual variables, i.e. all the restrictions $f^{[k]}$,
$k = 1,2,3$, are strictly convex. (We recall that a function
$f : R^m \to R$ is strictly convex if

$$f(\lambda x + (1-\lambda)y) < \lambda f(x) + (1-\lambda) f(y)$$

for all $x, y \in R^m$, $x \ne y$, $0 < \lambda < 1$. Note that
$f(x_1) = x_1^2$ is strictly convex when considered as $f : R \to R$, but
it is not when considered as $f : R^2 \to R$, i.e. when
$f(x_1, x_2) = x_1^2$.)

The set of all functions $f^k : R^n \to R$ with strictly convex
restrictions $f^{[k]}$ is larger (by inclusion) than the set of all
strictly convex functions $f^k : R^n \to R$. The following theorem is
stated for problem (P) whose constraints have strictly convex restrictions.

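THEOREM 1. Let $\{f^k : k \in \{0\} \cup P\}$ be differentiable convex
functions: $R^n \to R$, let $x^*$ be a feasible solution of problem (P)
and for all $k \in P(x^*)$ let the functions $f^{[k]}$ be strictly convex.
For a given subset $\Omega$ of $P(x^*)$ let the (n+1)-tuple
$(\lambda(\Omega), d(\Omega))$ be an optimal solution of the linear program

$$
\begin{array}{lll}
(L,\Omega)\qquad & \text{Max } \lambda & \\
& \text{s.t.} & \\
& d^t \nabla f^0(x^*) + \lambda \le 0 & (2)\\
& d^t \nabla f^k(x^*) + \lambda \le 0,\quad k \in \Omega & (3)\\
& d_{[k]} = 0,\quad k \in \bar\Omega \triangleq P(x^*) \setminus \Omega & (4)\\
& |d_i| \le 1,\quad i = 1,\dots,n. & (5)
\end{array}
$$

Then

(a) $x^*$ is an optimal solution of (P) if, and only if, for every
$\Omega \subset P(x^*)$, $\lambda(\Omega) = 0$;

(b) a vector $\bar d \in R^n$ is a feasible direction of descent at $x^*$
if, and only if, there exist a subset $\Omega$ of $P(x^*)$ and
$\bar\lambda > 0$ such that $(\bar\lambda, \bar d)$ is a feasible solution
of $(L,\Omega)$.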


Proof. It has been shown in [1, Corollary 2] that $x^*$ is an optimal
solution of problem (P), under the assumptions of Theorem 1, if, and only
if, for every subset $\Omega$ of $P(x^*)$ the system

$$
(P,\Omega)\qquad
\begin{aligned}
& d^t \nabla f^0(x^*) < 0\\
& d^t \nabla f^k(x^*) < 0,\quad k \in \Omega\\
& d_{[k]} = 0,\quad k \in \bar\Omega
\end{aligned}
$$

is inconsistent. (The proof of this result is long and therefore it is
not repeated here.) But the inconsistency of $(P,\Omega)$ is equivalent to
$\lambda^* = 0$ being the optimal value of the linear program

$$
\begin{aligned}
&\text{Max } \lambda\\
&\text{s.t.}\quad d^t \nabla f^0(x^*) + \lambda \le 0\\
&\phantom{\text{s.t.}\quad} d^t \nabla f^k(x^*) + \lambda \le 0,\quad k \in \Omega\\
&\phantom{\text{s.t.}\quad} d_{[k]} = 0,\quad k \in \bar\Omega.
\end{aligned}
$$

The normalization condition (5) can be added to the above program without
changing its optimal value. This proves the statement (a). Assume now
that there exist $\Omega \subset P(x^*)$ and $\bar\lambda > 0$ such that
$(\bar\lambda, \bar d)$ is a feasible solution of $(L,\Omega)$. Then
$(P,\Omega)$ holds for $d = \bar d$, implying that $\bar d$ is a feasible
direction of descent at $x^*$. Conversely, if $\bar d$ is a feasible
direction of descent at $x^*$, then the system

$$
\begin{aligned}
& \bar d^{\,t} \nabla f^0(x^*) < 0\\
& \bar d^{\,t} \nabla f^k(x^*) \le 0,\quad k \in P(x^*), \ \text{with equality only for those } k \in P(x^*) \text{ for which } \bar d_{[k]} = 0
\end{aligned}
$$

is consistent. This implies the existence of a subset $\Omega$ of $P(x^*)$
such that $(P,\Omega)$ is consistent and further the existence of
$(\bar\lambda, \bar d)$, with $\bar\lambda > 0$, which is a feasible
solution of $(L,\Omega)$. The statement (b) is proved. $\square$

Theorem 1 is important for the following reasons:

(i) It gives a characterization of optimality without any constraint
qualification assumption. In particular, using Theorem 1, one can treat
the lexicographic and Pareto optimization problems and establish whether
a feasible solution $x^*$ is optimal.

(ii) It explains why the Z-method generally terminates at a nonoptimal
point in the absence of Slater's condition. If a feasible point $x^*$ is
not optimal, then, by part (a) of Theorem 1, there exists a subset $\Omega$
of $P(x^*)$ such that the optimal value of the corresponding linear program
$(L,\Omega)$ is positive. This $\Omega$ can be any subset of $P(x^*)$, not
necessarily $P(x^*)$ itself, unless Slater's condition is satisfied.

(iii) It explains why the Z-method, in some situations, produces slow
convergence even in the presence of Slater's condition. The reason is the
following: If $x^*$ is a nonoptimal feasible point, then the optimal
value of the linear program $(L,P(x^*))$ is positive and one obtains a
feasible direction of decrease $d^z$. However, there are, generally, still
other subsets $\Omega$ of $P(x^*)$ for which the optimal value of
$(L,\Omega)$ is positive and which also generate (different) feasible
directions of decrease. In general, there are many such directions (at most
$2^{\mathrm{card}\, P(x^*)}$) and $d^z$, being only one of many, is not
generally the best. For instance, suppose that at a feasible point $x^*$
two subsets, say $\Omega_1$ and $\Omega_2 = P(x^*)$, have the property that
the optimal values $\lambda_1$ and $\lambda_2$ of $(L,\Omega_1)$ and
$(L,\Omega_2)$, respectively, are positive, in which case the two
directions $d^1$ and $d^2 = d^z$ are obtained. Now, if one chooses for the
feasible direction of descent $d^1$ if $\lambda_1 > \lambda_2$, or $d^2$ if
$\lambda_1 \le \lambda_2$, one gets locally a feasible direction of
decrease that is at least as good as the one obtained by the Z-method
(which produces only one direction $d = d^z$). One expects to obtain an
even better feasible direction of decrease if three, rather than two,
subsets are considered at $x^*$. In general, the higher the number of
subsets $\Omega$ considered, the better the direction of descent. If all
subsets are considered, one gets locally the best feasible direction of
descent, by statement (b) of Theorem 1.

In view of the above remarks, it is clear that one can use
Theorem 1 to formulate a new class of feasible direction methods. These
new feasible direction methods solve problem (P) without any reference
to the Kuhn-Tucker theory. If, at each iteration $x^\ell$, one considers
$k$ linear programming problems $(L,\Omega_1), (L,\Omega_2), \dots,
(L,\Omega_k)$, $\Omega_i \subset P(x^\ell)$, $i = 1,2,\dots,k$, which have
the optimal values $\lambda_1, \lambda_2, \dots, \lambda_k$ all positive,
and if the feasible direction of decrease is chosen to be the one generated
by the linear program having the largest optimal value, then we say that
the feasible direction method is of order $k$. The subsets $\Omega$ will be
considered in decreasing order, i.e. the one with the largest cardinality
is considered first, then one with the second largest cardinality is
considered, etc.
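The order in which the subsets are visited can be sketched with the standard library. The generator name `subsets_by_decreasing_cardinality` is an assumption of this illustration; the order among subsets of equal cardinality is not specified in the text and is simply lexicographic here.

```python
from itertools import combinations


def subsets_by_decreasing_cardinality(active):
    """Yield every subset of `active`, largest cardinality first."""
    items = sorted(active)
    for size in range(len(items), -1, -1):
        for combo in combinations(items, size):
            yield frozenset(combo)


# With three active constraints there are 2^3 = 8 subsets to consider.
order = list(subsets_by_decreasing_cardinality({1, 2, 3}))
```

The full active set comes first and the empty set last, which is the order needed when larger subsets are to be examined before smaller ones.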

Note that, whenever Slater's condition is satisfied, the feasible
direction method of order $k = 1$ is exactly the Z-method. (If Slater's
condition is satisfied, $\Omega = P(x^\ell)$ is the largest subset of
$P(x^\ell)$ and $\lambda(P(x^\ell)) = \lambda^z > 0$.)

Let us revisit Examples 1 and 2. Theorem 1 will now be applied to
the two problems in order to illustrate the points (ii) and (iii). New
feasible direction methods, which are based on Theorem 1, will be
formally introduced and studied in Section 4.

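Example 4. The Z-method has terminated in Example 1 at the nonoptimal
point $x^0 = (1,1,1)^t$. This has happened because Slater's condition is
not satisfied and the optimal value of $(L,P(x^0))$ is zero. However,
by Theorem 1, we know that there exists a subset $\Omega$ of $P(x^0)$ such
that the optimal value of $(L,\Omega)$ is positive. Indeed, for
$\Omega = \emptyset$ (the empty set),

$$
\begin{aligned}
(L,\emptyset)\qquad &\text{max } \lambda\\
&\text{s.t.}\quad d_1 + d_2 + d_3 + \lambda \le 0\\
&\phantom{\text{s.t.}\quad} d_1 = d_2 = 0\\
&\phantom{\text{s.t.}\quad} -1 \le d_i \le 1,\quad i = 1,2,3
\end{aligned}
$$

has the unique optimal solution $\lambda = 1$, $d_1 = d_2 = 0$,
$d_3 = -1$. (The constraint $d_1 = d_2 = 0$ is condition (4) for
$\bar\Omega = P(x^0) = \{1,2\}$.) The corresponding direction
$d = (0,0,-1)^t$ is feasible and it points towards the optimal solution
$x^* = (1,1,0)^t$.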
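The claim of Example 4 can be verified by checking the constraints (2)-(5) of $(L,\Omega)$ directly. The sketch below is a feasibility checker, not an LP solver; the function name `is_feasible_for_L`, the dictionary layout and the tolerance are assumptions made here, and the gradient data are those of Example 1 at $x^0 = (1,1,1)^t$.

```python
def is_feasible_for_L(lam, d, grad0, grads, supports, active, omega, tol=1e-12):
    """Check constraints (2)-(5) of the program (L, Omega) at (lam, d).

    grads[k] is the gradient of f^k at x*, supports[k] the index set [k]
    (1-based), `active` the set P(x*) and `omega` the chosen subset.
    """
    dot = lambda g: sum(di * gi for di, gi in zip(d, g))
    if dot(grad0) + lam > tol:                              # constraint (2)
        return False
    if any(dot(grads[k]) + lam > tol for k in omega):       # constraint (3)
        return False
    for k in set(active) - set(omega):                      # constraint (4)
        if any(abs(d[j - 1]) > tol for j in supports[k]):
            return False
    return all(abs(di) <= 1.0 + tol for di in d)            # constraint (5)


grad0 = (1.0, 1.0, 1.0)
grads = {1: (2.0, 2.0, 0.0), 2: (-2.0, -2.0, 0.0)}
supports = {1: {1, 2}, 2: {1, 2}}
active = {1, 2}
d = (0.0, 0.0, -1.0)

ok_empty = is_feasible_for_L(1.0, d, grad0, grads, supports, active, set())
ok_full = is_feasible_for_L(1.0, d, grad0, grads, supports, active, {1, 2})
```

With $\Omega = \emptyset$ the pair $(1, d)$ is feasible, while with $\Omega = P(x^0)$ the same $d$ only admits $\lambda = 0$, which is the situation in which the Z-method stalls.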

Example 5. Consider again the program introduced in Example 2. Starting
from $x^0 = (\tfrac{1}{2}, 0)^t$ and specifying $\Omega = \emptyset$, the
program $(L,\emptyset)$ gives the feasible direction $d^0 = (-1,0)^t$,
pointing towards the optimal solution $x^* = 0$. Thus the program is solved
in only one iteration.
If Slater's condition holds for problem (P) then the family of
linear programs $\{(L,\Omega) : \Omega \subset P(x^*)\}$ in Theorem 1 can
be replaced by the single program $(L,P(x^*))$. This is a well-known
result, which is stated below for the sake of completeness.

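Proposition 1. Let $\{f^k : k \in \{0\} \cup P\}$ be differentiable convex
functions: $R^n \to R$ and $x^*$ be a feasible solution of problem (P). If
Slater's condition holds for problem (P), then $x^*$ is an optimal solution
of (P) if, and only if, the optimal value of the linear program

$$
\begin{aligned}
(L,P(x^*))\qquad &\text{Max } \lambda\\
&\text{s.t.}\quad d^t \nabla f^0(x^*) + \lambda \le 0\\
&\phantom{\text{s.t.}\quad} d^t \nabla f^k(x^*) + \lambda \le 0,\quad k \in P(x^*)\\
&\phantom{\text{s.t.}\quad} |d_i| \le 1,\quad i = 1,\dots,n
\end{aligned}
$$

is zero.

In order to check whether Slater's condition holds one can use the
following proposition.

Proposition 2. Slater's condition holds for problem (P) if, and
only if, for an arbitrary fixed feasible point $x \in F$, the optimal
value $\lambda^*$ of the linear program

$$
\begin{aligned}
(CQ)\qquad &\text{Max } \lambda\\
&\text{s.t.}\quad d^t \nabla f^k(x) + \lambda \le 0,\quad k \in P(x)\\
&\phantom{\text{s.t.}\quad} |d_i| \le 1,\quad i = 1,\dots,n
\end{aligned}
$$

is positive.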



4. THE METHOD OF ELIMINATION OF LINEAR PROGRAMS


We will first study the method in which a direction of decrease
is determined by solving only one linear program. This method is termed
the Method of Elimination of Linear Programs of First Order (abbreviated:
MELP 1).

If Slater's condition is satisfied, then MELP 1 coincides with
the Z-method. If Slater's condition does not hold for problem (P), then
$\lambda^* = 0$ is the optimal value of (CQ), by Proposition 2. This
implies that $\lambda^z = 0$ is also the optimal value of problem (Z). But
(Z) is exactly $(L,P(x))$. This means that, if Slater's condition does not
hold, $\lambda(P(x)) = 0$ for every $x \in F$. Therefore, in order to test
optimality of $x^* \in F$ using Theorem 1, only proper subsets $\Omega$ of
$P(x^*)$ need checking. This calls for solving
$2^{\mathrm{card}\, P(x^*)} - 1$ linear programs (at most). However, in
many cases one can establish the zero optimal value of the programs
$(L,\Omega)$ without actually solving these programs. In order to
demonstrate this assertion, recall that in the case of programs with
strictly convex restrictions the program $(L,\Omega)$ has the zero optimal
value if, and only if, the system $(P,\Omega)$ is inconsistent. Whenever
the components of $d$ which correspond to the nonzero components of
$\nabla f^0(x^*)$ or $\nabla f^k(x^*)$, for at least one index
$k \in \Omega$, are annihilated by the requirement $d_{[k]} = 0$,
$k \in \bar\Omega$, then the system $(P,\Omega)$ contains the contradictory
statement $0 < 0$, i.e. the system $(P,\Omega)$ is inconsistent and the
program $(L,\Omega)$ has the optimal value zero. If this happens for a
particular subset $\Omega$ of $P(x^*)$, we say that the program
$(L,\Omega)$ is eliminated. The above condition can be formulated as
follows:


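Proposition 3. If for a given proper subset $\Omega$ of $P(x^*)$ the
elimination condition:

$$
\text{There exists } k \in \{0\} \cup \Omega \text{ such that }
\Big\{ i : \frac{\partial f^k(x^*)}{\partial x_i} \ne 0 \Big\} \subset \bigcup_{j \in \bar\Omega} [j]
\tag{6}
$$

is satisfied, then $\lambda(\Omega) = 0$. In fact, if (6) is satisfied for
$k = k_0$, then $\lambda(\Omega') = 0$ for every nonempty subset $\Omega'$
of $P(x^*)$ such that

$$k_0 \in (\{0\} \cup \Omega') \subset (\{0\} \cup \Omega). \tag{7}$$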


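Proof. First note that the constraint $\lambda \ge 0$ may be added to the
constraints of $(L,\Omega)$ because $\lambda = 0$, $d = 0$ is a feasible
solution. If (6) holds for $k = k_0$, then the elimination condition and
the conditions (4) imply that

$$d^t \nabla f^{k_0}(x^*) = 0$$

and hence (3), or (2), becomes $\lambda \le 0$. Therefore the optimal value
of $(L,\Omega)$ is clearly $\lambda(\Omega) = 0$. Note that

$$(\{0\} \cup \Omega') \subset (\{0\} \cup \Omega) \implies \Omega' \subset \Omega.$$

But

$$\Omega' \subset \Omega \implies \bar\Omega' \supset \bar\Omega \implies \bigcup_{j \in \bar\Omega'} [j] \supset \bigcup_{j \in \bar\Omega} [j]$$

and hence the elimination condition holds for $\Omega'$. Therefore
$\lambda(\Omega') = 0$. $\square$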
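The elimination test needs only set operations on the sparsity pattern. The sketch below reads the elimination condition (6) as follows, which is an interpretation of the text of this section: some $k \in \{0\} \cup \Omega$ has all nonzero components of $\nabla f^k(x^*)$ inside the union of the index sets $[j]$, $j \in \bar\Omega$. The function name `is_eliminated` and the data layout are assumptions of this illustration; the data are those of Example 1 at $x^0 = (1,1,1)^t$.

```python
def is_eliminated(omega, active, nonzero, supports):
    """Sufficient test for lambda(Omega) = 0 based on sparsity only.

    nonzero[k]  : 1-based indices of the nonzero components of grad f^k(x*)
                  (k = 0 is the objective);
    supports[j] : the index set [j] of the actual variables of f^j.
    """
    complement = set(active) - set(omega)
    annihilated = set().union(*(supports[j] for j in complement))
    return any(nonzero[k] <= annihilated for k in {0} | set(omega))


# Example 1 at x0: gradients (1,1,1), (2,2,0), (-2,-2,0); P(x0) = {1, 2}.
nonzero = {0: {1, 2, 3}, 1: {1, 2}, 2: {1, 2}}
supports = {1: {1, 2}, 2: {1, 2}}
active = {1, 2}

status = {
    name: is_eliminated(omega, active, nonzero, supports)
    for name, omega in [("{1}", {1}), ("{2}", {2}), ("empty", set())]
}
```

For these data the programs $(L,\{1\})$ and $(L,\{2\})$ are eliminated and $(L,\emptyset)$ is not, which agrees with Example 4, where $(L,\emptyset)$ has a positive optimal value. The test is only sufficient: a program that passes it may still have optimal value zero.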
In general, the "fuller" the problem is (i.e. the more actual
variables appear in the objective function and constraints of (P)), the
more programs $(L,\Omega)$ are eliminated. If the elimination condition
holds for all subsets of $P(x^*)$, then $x^*$ is optimal, by part (a)
of Theorem 1. Let us also note that the idea of elimination of linear
programs is meaningful for problem (P) even if Slater's condition does
hold. The elimination condition (6) will now be illustrated by an
example.


Example  6.  Consider 


min  f (x) 


+ e 


s.t. 


1 * 1 *0  o X, 

f (x)  - e 1 + e 2 + e 3 + e 4 


f2(x)  - e 1 


2 ~X4 

+ (l-x,r  + e 


- 4 S 0 

- 3 s 0 


f3(x)  - xj  + e 2 


+ (l+x,r  - 2 s 0 


f4(x)  - x?  + e"2  ♦ e'3  + e"X4  -3*0 


5 _X1  X2  X1 

f5(x)  - e X+  e 2 + 2e  3 


5 s 0 . 


Let $x^* = 0$. Then $P(x^*) = \{1,2,3,4\}$. The number of proper subsets
$\{\Omega \subset P(x^*)\}$ is $2^4 - 1 = 15$. Consider, for instance,
$\Omega = \{1\}$. Then $\bar\Omega = \{2,3,4\}$ and the system

$$
\begin{array}{ll}
-d_4 < 0 & (d^t \nabla f^0(x^*) < 0)\\
d_1 + d_2 - d_3 + d_4 < 0 & (d^t \nabla f^1(x^*) < 0)\\
d_1 = d_3 = d_4 = 0 & (d_{[2]} = 0)\\
d_2 = d_4 = 0 & (d_{[3]} = 0)\\
d_2 = d_3 = d_4 = 0 & (d_{[4]} = 0)
\end{array}
$$

is clearly inconsistent. The program $(L,\{1\})$ is thus eliminated.
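Instead of writing down all the systems
$\{(P,\Omega) : \Omega \subset P(x^*)\}$ to be checked for elimination, one
can associate with the above problem its incidence matrix $A = (a_{kj})$,
the components of which are

$$
a_{kj} =
\begin{cases}
0 & \text{if } \dfrac{\partial}{\partial x_j} f^k(x^*) = 0\\
1 & \text{otherwise.}
\end{cases}
$$

Here

[Table: incidence matrix of the partial derivatives
$\partial f^k(0)/\partial x_j$, with the row $k = 1$ ($\Omega = \{1\}$) and
the nonbinding constraint marked.]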
For $\Omega = \{1\}$, one considers the above table and concludes that the
elimination condition (6) is satisfied for $k = 0$ (also for $k = 1$).
Therefore, program $(L,\{1\})$ is eliminated. Checking all the subsets
$\Omega$ of $P(x^*)$ in this manner reveals that all the programs
$(L,\Omega)$ are eliminated, except three, which are $(L,P(x^*))$,
$(L,\emptyset)$ and $(L,\{1,3,4\})$. Thus only the programs $(L,P(x^*))$,
$(L,\emptyset)$ and $(L,\{1,3,4\})$ have to be actually solved in order to
check whether $x^* = 0$ is optimal.

Let us note that the last part of Proposition 3 is quite useful
in the process of elimination. For instance, as soon as one has
established that the program $(L,\{1,2,3\})$ is eliminated, one can
conclude that the programs $(L,\{1,2\})$, $(L,\{1,3\})$, $(L,\{2,3\})$,
$(L,\{1\})$, $(L,\{2\})$ and $(L,\{3\})$ (but not $(L,\emptyset)$!) are
eliminated as well. (Specify $\Omega = \{1,2,3\}$, $\bar\Omega = \{4\}$ and
$k_0 = 0$ in Proposition 3. Relation (7) becomes
$0 \in (\{0\} \cup \Omega') \subset \{0,1,2,3\}$, which is satisfied
for nonempty $\Omega'$ equal to $\{1,2\}$, $\{1,3\}$, $\{2,3\}$, $\{1\}$,
$\{2\}$ or $\{3\}$.) This suggests that in the process of elimination one
should start with subsets of cardinality $q = \mathrm{card}\, P(x^*)$ and
proceed to subsets of smaller cardinality.


If Slater's condition does not hold then the MELP 1 iterates as
follows, given a current point $x^\ell \in F$:

(a) Let $\Omega_0 = P(x^\ell)$.

(b) Check the elimination condition (6) on all subsets $\Omega$ of
$P(x^\ell)$ for which it is not established that $\lambda(\Omega) = 0$,
$\Omega \ne \Omega_0$, and for which
$\mathrm{card}\, \Omega \le \mathrm{card}\, \Omega_0$, starting with a
subset $\Omega$ of maximal cardinality. If an $\Omega$ is not eliminated,
go to (c). If all $\Omega$'s are eliminated, $x^\ell$ is optimal.

(c) Set $\Omega_0 = \Omega$ and solve $(L,\Omega_0)$. If
$\lambda(\Omega_0) > 0$, use $d^\ell = d(\Omega_0)$ as a direction of
descent. If $\lambda(\Omega_0) = 0$ and $\Omega_0 \ne \emptyset$, go to
(b). If $\lambda(\Omega_0) = 0$ and $\Omega_0 = \emptyset$, $x^\ell$ is
optimal.
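Steps (a)-(c) can be sketched as a scan over subsets with two callbacks. This is a simplified reading, not the authors' implementation: `solve_L` (an LP solver returning the optimal pair for a given subset) and `is_eliminated` (an elimination test) are hypothetical callbacks supplied by the caller, and the flag `skip_full` encodes the assumption that, when Slater's condition fails, the program for the full active set is already known to have value zero.

```python
from itertools import combinations


def melp1_direction(active, solve_L, is_eliminated, skip_full=True, tol=1e-9):
    """Scan subsets of `active` from largest to smallest cardinality.

    Returns (lam, d, omega) for the first non-eliminated program whose
    optimal value lam is positive, or None if no such program exists
    (the current point is then optimal by Theorem 1(a)).
    """
    items = sorted(active)
    for size in range(len(items), -1, -1):
        if skip_full and size == len(items):
            continue                      # step (a): value known to be zero
        for combo in combinations(items, size):
            omega = frozenset(combo)
            if is_eliminated(omega):      # step (b): no LP needs solving
                continue
            lam, d = solve_L(omega)       # step (c)
            if lam > tol:
                return lam, d, omega
    return None


# Stub callbacks mimicking Example 4: only (L, empty set) has lam > 0.
solved = []

def stub_solve(omega):
    solved.append(omega)
    return (1.0, (0.0, 0.0, -1.0)) if not omega else (0.0, (0.0, 0.0, 0.0))

stub_eliminated = lambda omega: omega in (frozenset({1}), frozenset({2}))
result = melp1_direction({1, 2}, stub_solve, stub_eliminated)
```

With the stubs above only one linear program is actually solved, since the two singletons are discarded by the elimination test.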


The MELP of Second Order (abbreviated MELP 2) determines at every point
$x^\ell$ two linear programs, say $(L,\Omega_1)$ and $(L,\Omega_2)$, with
nonzero solutions. If $(\lambda_1, d^1)$ and $(\lambda_2, d^2)$ are the
nonzero solutions of these programs, then the feasible direction of
decrease $d^\ell$ is chosen as follows:

$$
d^\ell =
\begin{cases}
d^1 & \text{if } \lambda_1 \ge \lambda_2\\
d^2 & \text{otherwise.}
\end{cases}
$$

In view of Proposition 3, it is suggested that linear programs
$(L,\Omega)$ with the highest cardinality of $\Omega$ be considered first.
Once $d^\ell$ is determined, one solves the one-dimensional step-size
problem $(S,\ell)$ and obtains a new point $x^{\ell+1}$.
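The selection rule of MELP 2 extends to any order by keeping the candidate with the largest optimal value. In the sketch below the function name `choose_direction` is an assumption, and so is the tie rule (ties go to the candidate solved first, i.e. the first direction is kept when the two values are equal).

```python
def choose_direction(candidates):
    """Pick the pair (lam, d) with the largest lam.

    `candidates` lists the nonzero solutions in the order in which the
    programs were solved; on ties the earlier candidate is kept.
    """
    if not candidates:
        raise ValueError("no candidate directions")
    best = candidates[0]
    for cand in candidates[1:]:
        if cand[0] > best[0]:
            best = cand
    return best


# Hypothetical solutions of two programs at the same point.
picked = choose_direction([(0.2, (1.0, 0.0)), (0.7, (0.0, 1.0))])
tie = choose_direction([(0.5, (1.0, 0.0)), (0.5, (0.0, 1.0))])
```

Passing three candidates gives the rule of a third-order method, and so on for higher orders.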


Remarks.

(i) Let us note that $d^1$ and $d^2$ depend on $\Omega$, i.e. $d^1$ is
generated by a subset $\Omega_1$, while $d^2$ is generated by another
subset $\Omega_2$. At every iteration $x^\ell$, the subsets $\Omega_1$ and
$\Omega_2$ are generally different. However, if Slater's condition is
satisfied, one of these two subsets is always $\Omega = P(x^\ell)$. (This
is a consequence of Proposition 1: If $x^\ell$ is not optimal, then
$(L,P(x^\ell))$ has an optimal solution $(\lambda^z, d^z)$ with
$\lambda^z > 0$ and $d^z \ne 0$. Since $\Omega = P(x^\ell)$ has the highest
cardinality of all the subsets of $P(x^\ell)$, $(L,P(x^\ell))$ is solved
and $d^z$ is picked as one of the two candidates for $d^\ell$.) Thus, if
Slater's condition holds, the feasible direction of decrease $d^\ell$ is
chosen as follows:

$$
d^\ell =
\begin{cases}
d^z & \text{if } \lambda^z \ge \lambda_2\\
d^2 & \text{otherwise.}
\end{cases}
$$

Here $(\lambda_2, d^2)$ is an optimal solution of $(L,\Omega)$ with
$\Omega$ having the highest cardinality among all proper subsets of
$P(x^\ell)$ and such that $d^2(\Omega) \ne 0$ in
$(\lambda(\Omega), d^2(\Omega))$.

(ii) If, at $x^\ell \in F$, all $(L,\Omega)$'s are eliminated or have
optimal solutions $(\lambda(\Omega), d(\Omega))$ with $d(\Omega) = 0$,
except one, then MELP 2 coincides with MELP 1 at this particular point.

The MELP of Third Order (abbreviated MELP 3) chooses at every
$x^\ell \in F$ the feasible direction of decrease $d^\ell$ by comparing
three nonzero solutions $(\lambda_1, d^1)$, $(\lambda_2, d^2)$ and
$(\lambda_3, d^3)$. The one with maximal $\lambda$ determines $d^\ell$. For
instance, if $\lambda_1 = \max(\lambda_1, \lambda_2, \lambda_3)$, then
$d^\ell = d^1$.

It is obvious that one can define the MELP of a higher order as
well. The "best" feasible direction of decrease is the one chosen after
solving and comparing all (noneliminated) programs $(L,\Omega)$ with
nonzero optimal solutions. In Section 5 it will be shown how the "best"
feasible direction of decrease can be determined at $x^\ell \in F$ by a
different, more practical approach.

The MELP (in particular MELP 2) has produced good numerical results for sparse convex programs with constraints which have strictly convex restrictions. In the case of the general convex program the MELP is much more involved and its computer implementation is not yet finalized. Nevertheless, for the sake of completeness, and as a suggestion for possible future research, the MELP for general convex programs will now be briefly described.

First we recall the following concept (e.g. [3]).

Definition 2. Let $f: R^n \to R$ and $x^* \in R^n$. Then

$$D_f(x^*) = \{ d \in R^n : \exists\, \bar\alpha > 0 \text{ such that } f(x^* + \alpha d) = f(x^*),\ \forall \alpha \in [0,\bar\alpha] \}$$

is the cone of directions of constancy of $f$ at $x^*$.

If $f$ is a convex differentiable function then $D_f(x^*)$ is a convex (but not necessarily closed) cone. In general, the cone of directions of constancy can be quite complicated. However, if the function $f$ is strictly convex, strictly convex in its actual variables or linear, this cone is quite simple, as shown by the following example.


Example 7. If $f: R^n \to R$ is strictly convex, then at every $x^* \in R^n$,

$$D_f(x^*) = \{0\}.$$

If $f: R^n \to R$ is strictly convex in its actual variables $x_1,\ldots,x_k$ ($k \le n$), then at every $x^* \in R^n$,

$$D_f(x^*) = \{ (0,\ldots,0,d_{k+1},\ldots,d_n)^T : d_i,\ i = k+1,\ldots,n \text{ arbitrary} \}.$$

For instance, consider $f^1: R^3 \to R$ and $f^2: R^2 \to R$ defined by

$$f^1 = x_1^2 + x_2^2 - 2, \qquad f^2 = e^{-x_1} - 1.$$

Here

$$D_{f^1}(x^*) = \{ (0,0,d_3)^T : d_3 \text{ arbitrary} \}$$

and

$$D_{f^2}(x^*) = \{ (0,d_2)^T : d_2 \text{ arbitrary} \}.$$
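The first instance can be checked numerically. The sketch below is plain Python; the helper name `is_direction_of_constancy` and the sampling scheme are illustrative assumptions, not part of the paper. It evaluates $f^1$ along a direction and reports whether the value stays constant on a sampled interval.

```python
def f1(x):
    """f^1(x) = x_1^2 + x_2^2 - 2 on R^3; x_3 is not an actual variable."""
    return x[0] ** 2 + x[1] ** 2 - 2.0


def is_direction_of_constancy(f, x, d, alpha_bar=1.0, samples=50, tol=1e-12):
    """Sampled test of f(x + alpha*d) == f(x) for alpha in [0, alpha_bar].

    Only finitely many alpha are tried, so the test can refute
    constancy but cannot prove it.
    """
    fx = f(x)
    for s in range(samples + 1):
        alpha = alpha_bar * s / samples
        point = [xi + alpha * di for xi, di in zip(x, d)]
        if abs(f(point) - fx) > tol:
            return False
    return True


x_star = [0.3, -1.2, 5.0]  # arbitrary test point
print(is_direction_of_constancy(f1, x_star, [0.0, 0.0, 1.0]))  # along x_3
print(is_direction_of_constancy(f1, x_star, [1.0, 0.0, 0.0]))  # along x_1
```

Moving along the third coordinate leaves $f^1$ unchanged, while moving along the first coordinate does not, in agreement with the cone stated in the example.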


If $f: R^n \to R$ is a linear form $f(x) = a^T x + \alpha$, where $a \in R^n$ and $\alpha \in R$, then at every $x^* \in R^n$,

$$D_f(x^*) = N(a^T),$$

the null-space of $a^T$. More generally, a function $f: R^n \to R$ of the form

$$f(x) = \varphi(Ax + b) + x^T a + \beta,$$

where $\varphi: R^m \to R$ is a strictly convex function, $A$ is an $m \times n$ matrix, $b \in R^m$, $a \in R^n$ and $\beta \in R$, has

$$D_f(x^*) = \{ d \in R^n : Ad = 0,\ d^T a = 0 \}.$$

These functions are called "faithfully convex" by Rockafellar. The class of such functions is rather large and it includes linear, strictly convex, quadratic and in fact all analytic convex functions.
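For a faithfully convex function the membership test $d \in D_f(x^*)$ reduces to two linear conditions, which the following Python sketch checks directly. The function name `in_cone_of_constancy`, the list-of-rows representation of $A$ and the tolerance are assumptions made for illustration only.

```python
def in_cone_of_constancy(A, a, d, tol=1e-10):
    """Return True when A d = 0 and d^T a = 0 hold up to tol.

    A is a list of rows (each a list of floats); a and d are lists of floats.
    """
    rows_vanish = all(
        abs(sum(r_j * d_j for r_j, d_j in zip(row, d))) <= tol for row in A
    )
    linear_part_vanishes = abs(sum(a_j * d_j for a_j, d_j in zip(a, d))) <= tol
    return rows_vanish and linear_part_vanishes


# Hypothetical instance: f(x) = exp(x_1 + x_2) + x_3 on R^3,
# i.e. phi = exp, A = [1 1 0], a = (0, 0, 1).
A = [[1.0, 1.0, 0.0]]
a = [0.0, 0.0, 1.0]
print(in_cone_of_constancy(A, a, [1.0, -1.0, 0.0]))  # A d = 0 and d^T a = 0
print(in_cone_of_constancy(A, a, [0.0, 0.0, 1.0]))   # d^T a = 1
```

The point $x^*$ does not enter the test, which reflects the fact that for this class the cone is the same at every point.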

The MELP is based on the following characterization of optimality stated for the general convex program (P).

THEOREM 2. Let $\{f^k : k \in \{0\} \cup P\}$ be differentiable convex functions $R^n \to R$ and let $x^*$ be a feasible solution of problem (P). For a given subset $\Omega$ of $P(x^*)$ let the $(n+1)$-tuple $(\lambda(\Omega),d(\Omega))$ be an optimal solution of the linear program over a cone

$(C,\Omega)$   max $\lambda$

s.t.

$d^T\nabla f^0(x^*) + \lambda \le 0$
$d^T\nabla f^k(x^*) + \lambda \le 0$, $k \in \Omega$
$d \in D_{f^k}(x^*)$, $k \in \Omega^*$; $|d_i| \le 1$, $i = 1,\ldots,n$.

Then

(a) $x^*$ is an optimal solution of (P) if, and only if, for every $\Omega \subset P(x^*)$, $\lambda(\Omega) = 0$;

(b) a vector $d \in R^n$ is a feasible direction of descent at $x^*$ if, and only if, there exists a subset $\Omega$ of $P(x^*)$ and $\lambda > 0$ such that $(\lambda,d)$ is a feasible solution of $(C,\Omega)$.

Proof. A point $x^* \in F$ is an optimal solution of problem (P) if, and only if, for every subset $\Omega$ of $P(x^*)$ the system

$d^T\nabla f^0(x^*) < 0$
$d^T\nabla f^k(x^*) < 0$, $k \in \Omega$
$d \in D_{f^k}(x^*)$, $k \in \Omega^*$

is inconsistent. (A proof of this statement can be found in [3, Theorem 1].) Theorem 2 is now proved in the same way as Theorem 1, the only difference being that "$d_{[k]} = 0$" is replaced by "$d \in D_{f^k}(x^*)$".

□

Note that Theorem 1 is a special case of Theorem 2, since in the case of constraint functions $f^k$, $k \in P$, which are strictly convex in their actual variables,

$$d \in D_{f^k}(x^*) \iff d_{[k]} = 0.$$

The MELP method (of order $k$) is formulated in the same way as before. The only difference is that instead of solving (or eliminating) linear programs $(L,\Omega)$ one has to solve (or eliminate) linear programs over cones $(C,\Omega)$.

The main difficulty in implementing the MELP in the general case is that the cones $D_{f^k}(x^\ell)$, $k \in \Omega^*$, have to be calculated at every iteration. However, in many situations, as shown by Example 7, these cones are readily available.

5.  THE  PARAMETRIC  FEASIBLE  DIRECTION  METHOD 

The  method  presented  in  this  section  is  based  on  the  parametric 
characterization  of  optimality  given  in  [1,  Theorem  3]  . Its  main 
features  are  that  it  is  also  capable  of  generating  directions  along  the 
boundary  of  the  feasible  region  and  it  consistently  gives  a feasible 
direction  of  steeper  descent  than  the  direction  generated  by  the  Z-method. 
This  is  achieved  by  solving  only  one  linear  program  at  each  iteration. 
Numerical  experience  indicates  that,  in  spite  of  the  additional  work  per 
iteration,  the  parametric  feasible  direction  method  (abbreviated  PFDM)  is 
overall  superior  to  the  Z-method  (especially  for  large  problems)  since 
the  number  of  iterations,  required  to  achieve  a desired  accuracy,  is  much 
smaller. The method is also capable of solving large sparse convex programs.


The PFDM is designed to solve convex differentiable programs, with faithfully convex constraints (see Example 7), which satisfy Slater's condition:

(FC)   Min $f^0(x)$

s.t.

$f^k(x) \triangleq h^k(A^k x + b^k) + x^T a^k + \alpha^k \le 0$, $k \in P$
$x^T c^\ell + \beta^\ell \le 0$, $\ell \in L$
$L \le x \le U$

where $h^k: R^{m_k} \to R$ is strictly convex, $A^k \in R^{m_k \times n}$, $b^k \in R^{m_k}$, $a^k \in R^n$, $\alpha^k \in R$ ($k \in P$), $c^\ell \in R^n$, $\beta^\ell \in R$, $\ell \in L$, and the bound vectors satisfy $L \in R^n$, $U \in R^n$.

The method is based on the following result.

Theorem 3. Let $x$ be a feasible nonoptimal solution of the convex program (FC). Then there exists a sufficiently small positive scalar $\bar\theta$ such that $d = d(\bar\theta)$, the optimal solution of the linear program

Min $d^T\nabla f^0(x)$

s.t.

$d^T\nabla f^k(x) + \bar\theta \left( |d^T a^k| + \sum_{i=1}^{m_k} |A_i^k d| \right) \le 0$, $k \in P(x)$
$|d_i| \le 1$, $i = 1,\ldots,n$

is a feasible direction of descent at $x$. Moreover, $0 < \theta \le \bar\theta$ implies

$$(d(\theta))^T\nabla f^0(x) \le (d(\bar\theta))^T\nabla f^0(x),$$

i.e. the smaller $\theta$ is, the steeper is the descent. Here $A_i^k$ denotes the $i$-th row of $A^k$.

Proof. The result follows from Theorem 3 in [1] after specifying $\varphi^k$, $k \in P$, to be the $\ell_1$ norm.


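For the use of the anti-jamming procedure we introduce the $\varepsilon$-active index sets:

$P_\varepsilon(x) = \{ k \in P : -\varepsilon \le f^k(x) \le 0 \}$

$L_\varepsilon(x) = \{ \ell \in L : -\varepsilon \le x^T c^\ell + \beta^\ell \le 0 \}$

$J_\varepsilon^+(x) = \{ j : U_j - \varepsilon \le x_j \le U_j \}$

$J_\varepsilon^-(x) = \{ j : L_j \le x_j \le L_j + \varepsilon \}$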
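The four $\varepsilon$-active index sets can be computed with simple comparisons. The following Python sketch is illustrative only: the function name `eps_active_sets` and its argument layout (precomputed constraint values passed as lists, 1-based indices in the result) are assumptions, not notation from the paper.

```python
def eps_active_sets(x, eps, f_values, lin_values, lower, upper):
    """Return (P_eps, L_eps, J_plus, J_minus) as lists of 1-based indices.

    f_values[k-1]   : value f^k(x) of the k-th nonlinear constraint
    lin_values[l-1] : value x^T c^l + beta^l of the l-th linear constraint
    lower, upper    : bound vectors L and U
    """
    P_eps = [k for k, v in enumerate(f_values, start=1) if -eps <= v <= 0.0]
    L_eps = [l for l, v in enumerate(lin_values, start=1) if -eps <= v <= 0.0]
    J_plus = [j for j, (xj, uj) in enumerate(zip(x, upper), start=1)
              if uj - eps <= xj <= uj]
    J_minus = [j for j, (xj, lj) in enumerate(zip(x, lower), start=1)
               if lj <= xj <= lj + eps]
    return P_eps, L_eps, J_plus, J_minus


# Hypothetical data: two variables in the box [0, 1]^2, three nonlinear
# constraints and one linear constraint, already evaluated at x.
sets = eps_active_sets(
    x=[0.0, 0.95], eps=0.1,
    f_values=[-0.05, -0.5, 0.0], lin_values=[-0.2],
    lower=[0.0, 0.0], upper=[1.0, 1.0],
)
print(sets)
```

Constraints whose value lies within $\varepsilon$ of zero are treated as active, which is what lets the direction-finding program anticipate constraints that are about to become binding.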


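Let $x$ be a feasible solution of (FC). The Z-method with the anti-jamming $\varepsilon$-active procedure for problem (FC) uses the following direction generating linear program

$(Z_\varepsilon)$   Max $\lambda$

s.t.

(8)   $\nabla f^k(x)\,d + \lambda \le 0$, $k \in \{0\} \cup P_\varepsilon(x)$

(9)   $d^T c^\ell \le 0$, $\ell \in L_\varepsilon(x)$
      $d_j \le 0$, $j \in J_\varepsilon^+(x)$
      $d_j \ge 0$, $j \in J_\varepsilon^-(x)$
      $-1 \le d_j \le 1$, $j = 1,\ldots,n$
      $\sum_{j=1}^{n} |d_j| \le \sqrt{n}$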


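For a given feasible direction $d$, the step-size generating problem for (FC) is

$(S,\varepsilon)$   Min $f^0(x + \alpha d)$

s.t.

$f^k(x + \alpha d) \le 0$, $k \in P$
$\alpha_1 \le \alpha \le \alpha_2$
$\alpha \ge 0$

where

$$\alpha_1 = \max\left\{ \max_{\ell \in L^-} \frac{-\beta^\ell - x^T c^\ell}{d^T c^\ell},\ \max_{j \in I^-} \frac{U_j - x_j}{d_j},\ \max_{j \in I^+} \frac{L_j - x_j}{d_j} \right\}$$

$$\alpha_2 = \min\left\{ \min_{\ell \in L^+} \frac{-\beta^\ell - x^T c^\ell}{d^T c^\ell},\ \min_{j \in I^+} \frac{U_j - x_j}{d_j},\ \min_{j \in I^-} \frac{L_j - x_j}{d_j} \right\}$$

while

$I^+ = \{ j : d_j > 0 \}$, $I^- = \{ j : d_j < 0 \}$

$L^+ = \{ \ell : d^T c^\ell > 0 \}$, $L^- = \{ \ell : d^T c^\ell < 0 \}$.

Here we use the convention that the maximum over the empty set is $-\infty$ and the minimum over the empty set is $+\infty$. The bounds $\alpha_1$ and $\alpha_2$ are obtained by requiring the point $x + \alpha d$ to be feasible for the linear constraints of (FC).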
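The bounds $\alpha_1$ and $\alpha_2$ follow from elementary ratio tests, which the Python sketch below carries out. It is a minimal illustration, assuming that $x$ is already feasible for the linear constraints; the function name `step_bounds` and the list-based data layout are hypothetical.

```python
import math


def step_bounds(x, d, C, beta, lower, upper):
    """Return (alpha1, alpha2) for the linear constraints of (FC).

    C is a list of vectors c^l, beta the list of scalars beta^l, and
    lower/upper are the bound vectors L and U. Empty maxima and minima
    default to -inf and +inf. x is assumed feasible.
    """
    alpha1, alpha2 = -math.inf, math.inf
    for c, b in zip(C, beta):
        dc = sum(ci * di for ci, di in zip(c, d))
        slack = -b - sum(ci * xi for ci, xi in zip(c, x))
        if dc > 0:        # l in L^+: upper bound on alpha
            alpha2 = min(alpha2, slack / dc)
        elif dc < 0:      # l in L^-: lower bound on alpha
            alpha1 = max(alpha1, slack / dc)
    for xj, dj, lj, uj in zip(x, d, lower, upper):
        if dj > 0:        # j in I^+
            alpha2 = min(alpha2, (uj - xj) / dj)
            alpha1 = max(alpha1, (lj - xj) / dj)
        elif dj < 0:      # j in I^-
            alpha1 = max(alpha1, (uj - xj) / dj)
            alpha2 = min(alpha2, (lj - xj) / dj)
    return alpha1, alpha2


# Hypothetical data: box [0, 1]^2 and one linear constraint x_1 - 0.75 <= 0.
print(step_bounds(x=[0.5, 0.5], d=[1.0, -1.0],
                  C=[[1.0, 0.0]], beta=[-0.75],
                  lower=[0.0, 0.0], upper=[1.0, 1.0]))
```

Since $x$ is feasible, $\alpha_1 \le 0 \le \alpha_2$, so together with $\alpha \ge 0$ the step-size search effectively runs over $[0, \alpha_2]$.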
Given a solution $d^Z$ of $(Z_\varepsilon)$ define the numbers

(10)   $\theta_k = - \dfrac{(d^Z)^T\nabla f^k(x)}{|(d^Z)^T a^k| + \sum_{i=1}^{m_k} |A_i^k d^Z|}$, $k \in P_\varepsilon(x)$

and the linear program

$(L_\varepsilon)$   Min $d^T\nabla f^0(x)$

s.t.

(11)   $d^T\nabla f^k(x) + \theta_k \left( |d^T a^k| + \sum_{i=1}^{m_k} |A_i^k d| \right) \le 0$, $k \in P_\varepsilon(x)$

and all constraints (9).
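Formula (10) is a single ratio per $\varepsilon$-active constraint. The Python sketch below evaluates it and confirms, on a made-up instance, that the Z-method direction satisfies constraint (11) with equality. The name `theta_k`, the helper `dot` and the sample data are assumptions for illustration.

```python
def dot(u, v):
    """Euclidean inner product of two equally long lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def theta_k(d_z, grad_fk, a_k, A_k):
    """theta_k = -(d^Z)^T grad f^k(x) / (|(d^Z)^T a^k| + sum_i |A^k_i d^Z|)."""
    numerator = -dot(d_z, grad_fk)
    denominator = abs(dot(d_z, a_k)) + sum(abs(dot(row, d_z)) for row in A_k)
    if denominator == 0.0:
        raise ZeroDivisionError("denominator of theta_k vanishes")
    return numerator / denominator


# Hypothetical constraint f(x) = exp(x_1) + x_2 - 1 at x = (0, 0):
# h = exp, A = [1 0], a = (0, 1), gradient = (1, 1).
A_k, a_k, grad = [[1.0, 0.0]], [0.0, 1.0], [1.0, 1.0]
d_z = [-1.0, -0.5]                      # made-up direction with d^T grad < 0
theta = theta_k(d_z, grad, a_k, A_k)
lhs = dot(d_z, grad) + theta * (abs(dot(d_z, a_k))
                                + sum(abs(dot(r, d_z)) for r in A_k))
print(theta, lhs)
```

The left-hand side of (11) evaluates to zero at $d = d^Z$, which is exactly why $d^Z$ is feasible for $(L_\varepsilon)$.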


Proposition 4. Let $x$ be a feasible but nonoptimal solution of (FC), let $(\lambda^Z > 0,\ d^Z)$ be an optimal solution of $(Z_\varepsilon)$ and $d^L$ an optimal solution of $(L_\varepsilon)$. Then $d^L$ is a feasible direction of steeper descent than $d^Z$, i.e.

(12)   $(d^L)^T\nabla f^0(x) \le (d^Z)^T\nabla f^0(x)$.

Proof. Since $\lambda^Z > 0$,

(13)   $(d^Z)^T\nabla f^k(x) < 0$, $k \in P_\varepsilon(x)$

by (8). Further, by the assumption that $f^k$, $k \in P$, are faithfully convex,

$$(d^Z)^T\nabla f^k(x) = [\nabla h^k(A^k x + b^k)]^T (A^k d^Z) + (d^Z)^T a^k,$$

which together with (13) shows that the denominator in $\theta_k$ is nonzero. Thus $\theta_k$ is well defined and positive for every $k \in P_\varepsilon(x)$. Hence, by Theorem 3 (see also the discussion in [1]), $d^L$ is a feasible direction. Now the definition of $\theta_k$ and the fact that $d^Z$ solves $(Z_\varepsilon)$ imply that $d = d^Z$ satisfies (9) and (11), i.e. $d = d^Z$ is feasible for $(L_\varepsilon)$, and hence (12) follows.

□


The PFDM will now be formulated for solving the program (FC).

Initialization. Specify $\varepsilon_0$ ("$\varepsilon$-activity parameter") and $\varepsilon' < \varepsilon_0$ ("stopping-rule parameter"). Find an initial feasible solution $x^0$. Set $k = 0$.

Step 1. Set $x = x^k$, $\varepsilon = \varepsilon_k$. Solve $(Z_\varepsilon)$. Let $\lambda^Z$, $d^Z$ denote the solution. Set $\lambda = \lambda^Z$, $d = d^Z$.

Step 2. If $P_\varepsilon(x) = \emptyset$, go to Step 4. Otherwise continue.

Step 3. Calculate $\theta_k$ by formula (10), solve $(L_\varepsilon)$. Denote the solution by $d$ and set $\lambda = -d^T\nabla f^0(x)$.

Step 4. If $\lambda > \varepsilon'$, solve $(S,\varepsilon)$. Let $\alpha^*$ be the optimal solution. Set $x^{k+1} = x + \alpha^* d$ and continue. Otherwise, set $x^{k+1} = x$ and go to Step 6.

Step 5. If $\varepsilon \le \varepsilon'$ set $\varepsilon_{k+1} =$ [illegible]. Otherwise set $\varepsilon_{k+1} = \min(\varepsilon,\lambda)$. Go to Step 1.

Step 6. If $\varepsilon \le \varepsilon'$ stop, $x^{k+1}$ is optimal. Otherwise set $\varepsilon_{k+1} = \min(\varepsilon,\lambda)$ and go to Step 1.


Remarks 


(i) The absolute value variables in $(Z_\varepsilon)$ are treated by the transformation

$d_i = d_i^+ - d_i^-$, $|d_i| = d_i^+ + d_i^-$, $d_i^+ \ge 0$, $d_i^- \ge 0$

(14)   $d_i^+ d_i^- = 0$.

The simplex algorithm applied to $(Z_\varepsilon)$ has to be modified so that the orthogonality condition (14) is satisfied. (Such a modification is standard in e.g. quadratic programming algorithms.)
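The identities behind this transformation are easy to verify on a concrete vector. The Python sketch below only illustrates the splitting itself (the function name `split_parts` is hypothetical); it does not show how a simplex code enforces condition (14) during pivoting.

```python
def split_parts(d):
    """Split d into its positive part d_plus and negative part d_minus."""
    d_plus = [max(0.0, di) for di in d]
    d_minus = [max(0.0, -di) for di in d]
    return d_plus, d_minus


d = [0.5, -1.0, 0.0]
d_plus, d_minus = split_parts(d)

for di, p, m in zip(d, d_plus, d_minus):
    assert di == p - m          # d_i = d_i^+ - d_i^-
    assert abs(di) == p + m     # |d_i| = d_i^+ + d_i^-
    assert p * m == 0.0         # orthogonality condition (14)
print(d_plus, d_minus)
```

Without condition (14) the pair $(d_i^+, d_i^-)$ is not unique, and $d_i^+ + d_i^-$ may exceed $|d_i|$; this is why the text asks for the orthogonality to be enforced inside the simplex algorithm.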


(ii) Introducing the transformation

$w_i^k = A_i^k d$, $i = 1,\ldots,m_k$
$w_{m_k+1}^k = d^T a^k$

the constraint (11) becomes

(15)   $d^T\nabla f^k(x) + \theta_k \sum_{i=1}^{m_k+1} |w_i^k| \le 0$
       $A_i^k d - w_i^k = 0$, $i = 1,\ldots,m_k$
       $d^T a^k - w_{m_k+1}^k = 0$.

The absolute values in this system are treated as in Remark (i).


(iii) The constraints (15), and also the absolute values transformation, increase the size of the linear programming problem equivalent to $(L_\varepsilon)$. Nevertheless the corresponding matrix is sparse and has a special structure which should be exploited to speed up calculations.


(iv) The normalization condition $-1 \le d_j \le 1$, $j = 1,\ldots,n$, is computationally convenient but may cause overall slow convergence. The same can be said about the condition $\sum_{j=1}^{n} |d_j| \le 1$. The combination

$-1 \le d_j \le 1$, $j = 1,\ldots,n$
$\sum_{j=1}^{n} |d_j| \le \sqrt{n}$

used in the algorithm can still be described by linear constraints and it has been found satisfactory in test problems. The above normalization condition is an approximation of the normalization $\sum_{j=1}^{n} d_j^2 \le 1$.


(v) In the practical implementation of the PFDM it has been found that $\varepsilon_0 = 0.1$ is a satisfactory choice.


(vi)  Numerical  experience  indicates  that,  particularly  in  the  early  stages 
of  the  algorithm,  it  is  preferable  to  solve  the  step-size  problem  not  too 
accurately. 

6.  OVERALL  COMPUTATIONAL  EXPERIENCE 


The authors have solved, by the MELP and the PFDM, more than one hundred convex programs with faithfully convex nonlinear constraints. The size of the programs has ranged from small to ones with 100 variables and 50 nonlinear constraints. The overall experience suggests that the MELP 2 gives very good results, particularly for sparse programs with constraint functions having strictly convex restrictions. The method also seems to be rather unaffected by jamming. The PFDM produces excellent results for any kind of faithfully convex constraints.

Let us finally demonstrate the methods by solving two nontrivial programs.

Example  8.  Consider 


Min  f°(x)  “ x4  - x2  + (x3-l)2  + (x4-2)2  + (Xj-2)2 


8 • 1 # 


X, 

fA(x)  . e 1 


f2(x)  . x2 


+ x., 


. 2 . ~x3 

+ Xg  + e 


- 1 s 0 

-ISO 


r(x)  - x. 


•+  x‘  + x~  -ISO 


f (x) 


X2  ' 2x2 


s 0 


f5(x)  - (Xj-1)2  + Xj 


f°(x)  - 


-X, 


+ e 


-ISO 

-ISO 


f7(x) 


“x5 

+ e - 1 s 0, 


One can show that, at the feasible point $x^0 = 0$, the optimal value $\lambda$ of the linear program (CQ) is zero. Therefore, by Proposition 2, Slater's condition is not satisfied here. The Z-method terminates here at $x^0 = 0$, a nonoptimal point. However, the MELP 1 is applicable and, starting from the initial point $x^0 = 0$, it gives the following results:

Step  x_1  x_2  x_3      x_4      x_5      Active constraints  Pivot set  Step size  f^0(x)
----  ---  ---  -------  -------  -------  ------------------  ---------  ---------  -------
 0    0    0    0        0        0        {1,2,4,5,6,7}       {2,6,7}    0.70711    9
 1    0    0    0.70711  0.70711  0.70711  {1,3,4,5}           {3}        0.10138    3.42893
 2    0    0    0.80849  0.69226  0.70711  {1,4,5}             ∅          0.00739    3.41843
 3    0    0    0.81587  0.69965  0.71449  {1,3,4,5}           {3}        0.02806    3.37735
 4    0    0    0.84393  0.72513  0.68643  {1,4,5}             ∅          0.00107    3.37513
 5    0    0    0.84500  0.72619  0.68750  {1,3,4,5}           {3}        0.03226    3.36927
 6    0    0    0.87726  0.69393  0.71726  {1,4,5}             ∅          0.00142    3.36631
 7    0    0    0.87869  0.69536  0.71868  {1,3,4,5}           {3}        0.02273    3.35859
 8    0    0    0.90141  0.71670  0.69596  {1,4,5}             ∅          0.00071    3.35710
 9    0    0    0.90212  0.71741  0.69667  {1,3,4,5}           {3}        0.01888    3.35329
10    0    0    0.92100  0.69853  0.71462  {1,4,5}             ∅          0.00049    3.35226
11    0    0    0.92149  0.69901  0.71511  {1,3,4,5}           {3}        0.01489    3.34968
12    0    0    0.93638  0.71332  0.70021  {1,4,5}             ∅          0.00032    3.34903
13    0    0    0.93669  0.71364  0.70053  {1,3,4,5}           {3}        0.01201    3.34736
14    0    0    0.94871  0.70163  0.71216  {1,4,5}             ∅          0.00020    3.34694
15    0    0    0.94890  0.70182  0.71236  {1,3,4,5}           {3}        0.00965    3.34590
16    0    0    0.95855  0.71123  0.70271  {1,4,5}             ∅          0.00013    3.34562
17    0    0    0.95868  0.71135  0.70283  {1,3,4,5}           {3}        0.00779    3.34497
18    0    0    0.96646  0.70357  0.71046  {1,4,5}             ∅          0.00010    3.34479
19    0    0    0.96655  0.70366  0.71055  {1,3,4,5}           {3}        0.00629    3.34432
20    0    0    0.97284  0.70984  0.70426  {1,4,5}             ∅          0.00006    3.34420
21    0    0    0.97290  0.70989  0.70431  {1,3,4,5}           {3}        0.00509    3.34392
22    0    0    0.97798  0.70481  0.70933  {1,4,5}             ∅          0.00004    3.34384
23    0    0    0.97803  0.70485  0.70937  {1,3,4,5}           {3}        0.00779    3.34363
24    0    0    0.98582  0.70485  0.70929  {1,4,5}             ∅          0.00004    3.34357
25    0    0    0.98586  0.70489  0.70933                                            3.34335

By the "pivot set $\Omega$" we have denoted, at every iteration $x^\ell$, the biggest (by cardinality) subset $\Omega$ of $P(x^\ell)$ which generates the direction of decrease $d(\Omega) \neq 0$ (as a solution of the linear program $(L,\Omega)$). The sequence $x^\ell$ converges to the optimal solution $x^*$ with the optimal value $f^0(x^*) = 9 - 4\sqrt{2}$.

The same problem will now be solved using the MELP 2. Starting from the same initial approximation $x^0 = 0$, the following results are obtained:

Step  x_3      x_4      x_5      Active constraints  Pivot set  Step size  f^0(x)
----  -------  -------  -------  ------------------  ---------  ---------  -------
 0    0        0        0        {1,2,4,5,6,7}       {2,6,7}    0.70711    9
 1    0.70711  0.70711  0.70711  {1,3,4,5}           ∅          0.29289    3.42893
 2    1.00000  0.70711  0.70711                                            3.34314

The optimal solution, correct to five decimal places, is reached in only two iterations! At the initial approximation $x^0 = 0$, two noneliminated subsets of largest cardinality are $\Omega_1 = \{2,6,7\}$ and $\Omega_2 = \{2,6\}$. Since the corresponding optimal values of the linear programs are here equal, i.e. $\lambda(\Omega_1) = \lambda(\Omega_2) = 1$, we choose $\Omega_1$ to be the pivot set. (Whenever there is a tie, as in the above case, one can systematically choose the first $\Omega$ for the pivot.) At the next approximation $x^1$, the two noneliminated subsets are $\Omega_1 = \{3\}$ and $\Omega_2 = \emptyset$. Since $\lambda(\Omega_1) = 0.20711$ and $\lambda(\Omega_2) = 0.58578$, we conclude that $\lambda(\Omega_2) > \lambda(\Omega_1)$ and choose $\Omega_2$ to be the pivot set. Let us note that, at each iteration $x^\ell$, there are here at most two noneliminated subsets.

Example 9. The following program with 50 variables, 20 nonlinear convex and 5 linear constraints has been solved by the PFDM and compared with the Z-method.

' 1 50  2 

Min  *•  2 (x  -a.) 

j-1  33 

s.t. 

exp(x1  + x2  + x^)  + 2exp(Xj  - x^  - x$)  £ 20.83 
exp(-Xg)  + exp(-Xg  + 2xj  + Xg)  + exp(x1Q  - x^)  £ 2.05 
exp(-x9  + 2x15>  + 2exp(-x12)  £ 25 
4exp(5x13  + x^7  - 2xlg)  + exp(x17>  £ 215.68 
exp(-x16)  + 3exp(-x26)  + exp(-x3g)  £ 5 
exp(-x21)  + 7exp(x^  - x^)  + exp(x22  + *2l)  £ 142.7 
exp(x5  +0.1  x19  - 2x2g)  + 2exp(-x5  + x29)  £ 3 
exp(0.4(xft  + x25)  + x27)  + 10exp(-x23>  S 10 

_2 

10  exp(x31  + x32  + x33>  + exp(x34  - x35>  + exp(-x31>  £ 5 
-3 

10  exp(0.5(x34  + x35>  + x36>  + exp(x37  + x3g)  £ 1.15 

10~3exp(x39  + x4Q  + x41  + x42  + xA3)  + exp(-x44  - x^)  £ 0.29 

exp(x48  “2x^ 7 + + exP(x^g  “ x5q^  s 2 

exp(x^2  - x16)  + exp(x12  + xlfi)  £ 10 
4 

Xg  - Xg  + exptx^)  + x13  £ -0.63 

(xu  - l)2  + (x21  - l)2  + (xn  - l)2  £ 29 
2 2 2 

xx  + x2  + x3  + 3xj3  £ 8 

2 -2 

exp(-x2  + Xg)  + (x2Q  + x3Q)  s 10 

2 ' 

(x3  + 2x4)  + exp(-Xg)  £ 10 
(x16  + x17  + x18>2  + 10  3 exp(-Xjj)  S 9 
exp(x8)  + (x4g  - x49  + x50)4  - xg  S 629 

X3  + 3x6  + x12  “ ?X20  ” 2x30  5 0 
X1  + X7  " 3x29  ' 4x41  ’ 4x42  S “9 
x5  + x15  + x25  + x35  S 9,5 
X49  " x50  5 0 

4x»  + 3x,n  - 0.5x,_  + x«t  - 3x.,  s 7 
2 10  13  23  43 


where  a^*  are  given  below 


The  Initial  feasible  point  x is 


3 0.5 


and  chc  optimal  solution 


x is 


-9 

5 

ft 

0 3 

0 

0 

0 

0 

5 

0 

-2 

0 

-33  0 

0 

0 

0 

0 

0 

0 

0 

0 

1 0 

0 

0 

0 

0 

0 

0 

0 

0 

0 0 

0 

0 

0 

0 

0 

0 

0 

0 

0 1 

1 

0 

0 

0 

0 

0 

with the optimal value $f^0(x^*) = 149.005$. One can verify that $x^*$ satisfies the Kuhn-Tucker condition with multipliers $\lambda_5 = \lambda_7 = \lambda_{12} = \lambda_{18} = \lambda_{21} = \lambda_{22} = \lambda_{24} = 1$, $\lambda_k = 0$ otherwise.

Value of objective function

Iteration  PFDM     Z-method
---------  -------  --------
  0        574.755  574.755
  1        551.615  574.755
  5        407.631  570.873
 10        310.172  440.147
 50        149.795  391.386
100                 350.089
400                 246.635
800                 199.518

The MELP and the PFDM have been tested on the IBM 360/75 computers at the Technion and at McGill University.

